1. INTRODUCTION

Lung cancer is characterized by uncontrolled cell division in the lungs. Healthy cells divide and
replicate in a controlled way; when damaged cells divide uncontrollably, they form tumors, which
eventually hinder organs from functioning correctly. Compared with other kinds of cancer, lung cancer is
the deadliest, making it the leading cause of cancer death globally [1]-[6]. It is the world's second
most common cancer, behind breast cancer. According to World Health Organization (WHO) figures issued in
2020, lung cancer mortality in Indonesia reached 28,633 in 2019. In 2022, the United States was expected
to have 1,918,030 new cancer cases and 609,360 cancer deaths, with lung cancer, the leading cause of
cancer death, accounting for approximately 350 deaths per day [7]. However, if nodules are found at an
early stage, the patient survival rate can be raised.

Lung cancer detection methods using computer-aided detection (CAD) have been developed in recent
years [8]-[16]. Early identification of lung cancer has been shown to both lower mortality rates and
increase the likelihood of survival. Lung nodules that are still relatively small can be either benign or
malignant. Benign lung tissue does not grow much, whereas malignant lung tissue develops swiftly and
invades the body, making it extremely hazardous to health [17]. The most important goals of CAD are to
reliably recognize images and to extract regions of interest (ROI) from images obtained from a variety of
imaging modalities. These imaging modalities include computed tomography (CT) scans, X-rays, positron
emission tomography (PET), and magnetic resonance imaging (MRI) [18]-[27]. CAD systems are further
subdivided into computer-aided detection (commonly abbreviated as CADe) and computer-aided diagnosis
(CADx). The CADe system's capabilities are limited to identifying abnormal tissue regions in images,
whereas the CADx system can be used to diagnose a disease by determining the sort of abnormality present
and whether it is malignant [28].

Previously, a substantial amount of research was conducted on lung nodule identification using the CADe
system. Because of limited resources and enormous volumes of data, such systems were often based on
traditional machine learning, including linear discriminant analysis, multiple gray-level thresholding,
distance transformation, and support vector machines (SVM) [29]-[33]. However, in recent years, a number
of researchers have created deep learning-based lung nodule detection techniques. One of these is the
convolutional neural network (CNN) approach, which has exceptional computer vision performance and
increases the CADe system's accuracy and sensitivity [34]. Trends for the years 2019-2022 include deep
learning in the form of CNN as well as the performance of each individual approach. The objective of this
study is to provide researchers with an overview of CNN-based CAD systems for the identification of lung
nodules. The review covers the phases of the CAD system for lung nodule identification in general, the
preprocessing, segmentation, and detection techniques that have been extensively utilized from the
beginning to the present, and the lung nodule identification approaches now in use.


2. LUNG NODULES 

Lung nodules are abnormal growths that develop within the lungs, and they are extremely common. A lung
may have one or many nodules, and nodules can form in either lung. Up to fifty percent of persons who
have chest X-rays or CT scans have them. About 95% of lung nodules are benign (not cancerous); only
infrequently do lung nodules indicate lung cancer. Because small lung nodules rarely cause symptoms,
further testing is required to determine whether a nodule is cancerous.

Analysis of lung nodules, through identification and categorization, is one of the steps that must be
taken to prevent lung cancer successfully. Dark-level lung nodules are typically between 3 and 30
millimeters in diameter [34]. In general, these nodules have a diameter of around 3 millimeters.
Figure 1 depicts samples of the various lung nodule categories. The circumscribed, juxta vascular, juxta
pleural, and pleural tail nodules are represented in Figures 1(a) to 1(d), respectively. Circumscribed
nodules are not associated with any other tissue structures and are seen in a dispersed manner throughout
the tissue. In contrast, juxta vascular nodules are firmly linked to blood vessels, while juxta pleural
nodules are located in the region around the pleura [28].


(a) The nodule is centrally positioned in the lung and has no link to the vasculature.
(b) While placed centrally in the lung, the nodule has extensive linkages to adjacent arteries.
(c) A substantial amount of the nodule is connected to the pleural surface.
(d) The nodule is located at the pleural surface and is connected by a thin surface.

Figure 1. Categories of lung nodule samples [28]: (a) well-circumscribed, (b) juxta vascular,
(c) juxta pleural, and (d) juxta pleural tail
3. COMPUTER-AIDED DESIGN FOR LUNG NODULE DETECTION

There are five stages in the CAD system for lung nodule detection: i) image acquisition,
ii) preprocessing, iii) lung segmentation, iv) nodule detection, and v) classification. A comprehensive
schematic representation of the lung CAD process is shown in Figure 2. Medical images are acquired by a
variety of imaging modalities, including CT scans, X-rays, and MRI [35]. Medical images can be gathered
from publicly available image databases to provide researchers with a source of data for CADe system
research, development, testing, evaluation, and benchmarking. Table 1 lists several public databases of
lung scans. These include the Japanese Society of Radiological Technology (JSRT), the early lung cancer
action program (ELCAP), the Dutch-Belgian randomized lung cancer screening trial, or in Dutch
Nederlands-Leuvens Longkanker Screenings Onderzoek (NELSON), automatic nodule detection 2009 (ANODE09),
the lung image database consortium and image database resource initiative (LIDC-IDRI), lung nodule
analysis 2016 (LUNA16), Kaggle, and others. The lung ROI is segmented and noise is reduced during the
preprocessing phase of the lung nodule CAD system. As a result, the search range for lung nodules is
narrowed and the data are normalized [10]. Preprocessing filters include the Wiener filter, the grayscale
filter, the histogram equalization filter, the linear mapping filter, and the Gaussian lowpass filter
[53], [54].

Figure 2. Acquisition, preprocessing, lung segmentation, nodule detection, and classification are the
steps in the CAD schematic for lung nodule detection


Table 1. List of public lung image databases

Public database                Release year  Scans    Modalities
PLCO Trial                     1993          52,320   CXR
JSRT [36]                      1998          247      CXR
ELCAP [37]                     2003          50       CT
NELSON [38], [39]              2003          15,822   CT
RIDER [40]                     2004          48,698   CT
ANODE09 [41]                   2009          55       CT
NLST Trial [42]                2009          75,000+  CT
Lung TIME [43]                 2009          157      CT
LIDC-IDRI [1]                  2011          1,018    CT, CR, DX
ILD [44]                       2012          905      CT
ACRIN-NSCLC-FDG-PET [45]       2013          3,377    PT, CT, MR, CR, DX, SC, NM
NSCLC-Radiomics [46]           2014          1,265    CT, RTSTRUCT, SEG
LungCT-Diagnosis [47]          2015          61       CT
QIN Lung CT [48]               2015          47       CT
LISS [49]                      2015          271      CT
LUNA 16/Ali Tianchi [50]       2016          888      CT
Italung-CT [51]                2016          122      CT
Kaggle Data Science Bowl [17]  2017          285,380  CT
UniToChest [52]                2021          306,440  CT

Note:
JSRT: Japanese Society of Radiological Technology
PLCO: Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial
ELCAP: Early Lung Cancer Action Project
NELSON: Nederlands Leuvens Longkanker Screeningsonderzoek
ANODE: Automatic Nodule Detection
LIDC: Lung Image Database Consortium
IDRI: Image Database Resource Initiative
ILD: Interstitial Lung Disease
LUNA: Lung Nodule Analysis
Italung-CT: Italian Lung
CXR: Chest X-Ray
CT: Computed Tomography
CR: Computed Radiography
DX: Digital X-ray
PT: Positron emission tomography (PET)
MR: Magnetic Resonance
SC: Secondary Capture
NM: Nuclear Medicine
RTSTRUCT: Radiotherapy Structure Set
SEG: Segmentation

Lung segmentation is the process of separating a nodule from surrounding areas of a CT scan and then
enhancing the resulting image to obtain more information about the nodule [55]. Accurate lung
segmentation and the method used to obtain the lung volume from CT images are two factors that determine
the efficiency of a system designed to identify lung nodules. Segmentation methods include quality
thresholding, morphological operations, the watershed approach, the active shape model (ASM), the active
appearance model (AAM), and the faster region CNN (faster R-CNN) [50], [56].

The process of finding nodules in the lung that may progress to lung cancer is known as nodule
detection [28]. Naive Bayes, random forests, SVMs, multi-layer perceptrons (MLP), and CNNs are some of
the approaches used to identify lung nodules [1], [41], [57]. Because the nodule identification stage
frequently yields high sensitivity but poor accuracy, a subsequent phase, termed false-positive (FP)
reduction, is required [34]. Feature extraction and nodule categorization with feature-based classifiers
are applied at this stage [41]. Features are extracted from shape and texture. Shape features are
measured using the geometric values of each structure (such as form proportion, density, roundness,
elongation, weighted radial distance, and Boyce-Clark radial shape index); because nodules are rounder
than other tissue structures, the most spherical shapes are sought [58]. After the feature extraction
step, several supervised or unsupervised classifiers are used to detect the nodule and lower the number
of FPs [28]. To reduce the CADe system's false-positive rate, this FP reduction stage focuses on
recognizing true lung nodules among all suspected nodules and removing false ones. The classification
procedure has four possible outcomes. A candidate that is classified correctly is either a true positive
(TP), a nodule reported as a nodule, or a true negative (TN), a non-nodule rejected as such. A candidate
that is classified wrongly is either a false positive (FP), a non-nodule reported as a nodule, or a
false negative (FN), a nodule that is missed [59].
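
The four outcomes translate directly into the figures commonly reported for a CADe system. The sketch below is only illustrative: the function name `detection_metrics` and the candidate counts are assumptions made for this example and are not taken from any of the reviewed studies.

```python
def detection_metrics(tp, fp, fn, tn):
    """Derive common evaluation figures from the four outcome counts."""
    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g., no nodules in the test set).
        return numerator / denominator if denominator else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),   # share of real nodules that were found
        "specificity": ratio(tn, tn + fp),   # share of non-nodules correctly rejected
        "precision": ratio(tp, tp + fp),     # share of reported nodules that are real
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical counts: a detector that finds most nodules but raises many false alarms.
metrics = detection_metrics(tp=90, fp=60, fn=10, tn=840)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, sensitivity is high while precision is low, which is the situation that motivates a separate false-positive reduction stage.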


4. PREPROCESSING 

Preprocessing is an essential first step for the CT images used in lung nodule detection because raw CT
images contain a great deal of noise and extraneous data that hinder the CAD system's ability to detect
lung nodules [34]. This review divides preprocessing into three stages: image smoothing, edge sharpening,
and noise removal. The preprocessing stages are explained briefly in this section, which concludes with
the possibility of using CNN-based methods to remove noise from lung images.


4.1. Image smoothing 

Images are impacted by a number of factors, not all of which are related to the viewer's ability to 
visually perceive the image. In addition to this, they make it more difficult to recognize and differentiate 
between image features that are important for various applications, such as pattern recognition and image 
segmentation. Noise is one of these factors that occurs quite frequently, and it has the potential to have a 
significant impact on both the visual quality of images and the performance of the majority of image processing 
tasks. It is due to inaccuracies that occurred during the image acquisition process [60]. 

Images are frequently captured under conditions that are not optimal; for example, there may be
insufficient light, excessive clarity, or unfavorable weather. Equipment of inferior quality can further
degrade acquisition through transmission errors, problems with networked cables, signal disturbances,
problems with sensors, and so on. As a result, the pixel intensity values do not accurately reflect the
true colors of the scene at acquisition time. Because of these factors, a wide variety
of techniques have been developed in order to retrieve lost image information and improve the details of 
images. Image smoothing is a technique that is included in preprocessing techniques and is used to remove 
possible image perturbations without losing any of the image's information [60]. Image smoothing is used to 
define smooth, consistent borders for segmented lungs near the mediastinum. As a result, smoothing operations 
must be restricted to the region around the mediastinum so that the contour of the lung in areas other than the 
mediastinum is not affected [61]. 

Some smoothing methods include restricted cubic splines (RCS), penalized splines (P-splines), natural 
splines (NS), and fractional polynomials (Fracpoly). The RCS is a cubic regression spline with continuous first 
and second derivatives at the knots for visual smoothness. RCS are further limited to be linear above the last 
knot and below the first. The linearity in the tails allows for a more compact model. P-splines were fitted using 
R's standard software implementation. By modeling the smooth function, P-splines provide an approach to 
determining optimal smoothing via degrees of freedom that is relatively robust to the choice of location
and a relatively large number of knots. The natural spline is simply a constrained cubic spline that
employs B-spline basis functions instead of piecewise polynomials. Fractional polynomials, like P-splines
and RCS, can be employed with any generalized linear model to analyze survival data. Although a global
(rather than local) approach, the fractional polynomial model has the advantage of being a simpler form
than the other two possibilities and incorporating a wider range of functional forms than the normal
polynomial family allows, so a broader range of possible dose-response relationships can be accommodated
[62], [63]. The mean filter, the Gaussian filter, anisotropic diffusion, the median filter, the adaptive
median filter, conservative smoothing, and the alpha trim mean filter are other image smoothing methods
that can be used to reduce low-level distortion in lung images
[37], [53]-[55], [58], [64].
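
As an illustration of one of the filters listed above, the following is a minimal pure-Python sketch of a median filter. It assumes the image is a small 2D list of gray levels and that border pixels use only the part of the window that lies inside the image; the function name `median_filter` and the sample values are made up for this example.

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1) x (2*radius+1) neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighborhood, clipped at the image border.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out


# A flat region with two impulse ("salt and pepper") outliers.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 0],
]
smoothed = median_filter(noisy)
```

Because the median ignores isolated extreme values, both outliers are replaced by the surrounding gray level, whereas a mean filter would smear them into their neighbors.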


4.2. Edge sharpening

Depending on the image domain, sharpening methods can be divided into two groups: spatial-based and
frequency-based methods. In the first case, we work directly with the pixels, whereas in the second, we
work with the image's transform coefficients (Fourier or wavelet), and the effect of the alteration
becomes visible only when the image is recovered using the inverse transform. Spatial domain techniques
sharpen images by altering pixel values. Enhancing the contrast between various elements of the image is
one way to improve it. There are various techniques for sharpening images in the spatial domain.
Histogram equalization (HE) is one of the best-known techniques. Contrast stretching (CS), which alters
the dynamic range, that is, the range between the minimum and maximum intensity values of the image's
gray levels, is another well-known spatial domain method. The simplest contrast stretch algorithm, linear
contrast stretch (LCS), stretches the pixel values of a low- or high-contrast image so that the dynamic
range extends across the entire intensity range. One drawback of this method is the loss of
certain details due to saturation and clipping [1], [53], [55], [60], [65].
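
The two spatial-domain techniques named above can be sketched in a few lines. This is an illustrative sketch under simple assumptions, not the implementation used in the cited works: the image is a 2D list of integer gray levels in 0..255, and the function names are chosen for this example.

```python
def linear_contrast_stretch(image, out_min=0, out_max=255):
    """Map the image's [min, max] intensity range linearly onto [out_min, out_max]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: nothing to stretch
        return [[out_min for _ in row] for row in image]
    return [
        [round(out_min + (v - lo) * (out_max - out_min) / (hi - lo)) for v in row]
        for row in image
    ]


def histogram_equalization(image, levels=256):
    """Remap gray levels through the normalized cumulative histogram."""
    flat = [v for row in image for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # only one gray level present
        return [row[:] for row in image]
    return [
        [round((cdf[v] - cdf_min) * (levels - 1) / (n - cdf_min)) for v in row]
        for row in image
    ]


low_contrast = [[100, 110], [120, 150]]
stretched = linear_contrast_stretch(low_contrast)
equalized = histogram_equalization(low_contrast)
```

The stretch keeps the relative spacing of gray levels, while equalization spreads them according to how often they occur, so the two outputs differ even on this tiny image.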

Frequency domain approaches rely on transformations such as the discrete Fourier (or cosine)
transform or wavelet transforms. None of these strategies is a single method; each is a family of
methods that are fundamentally the same yet vary slightly from one another. They operate as follows:
first, one of these transforms is applied; then, the transformed image is processed; and finally, the
inverse transform of the processed image yields the output. This approach has a significant advantage:
the ability to separate the various components of an image. Higher frequencies correspond to image edges
or fine features, while lower frequencies correspond to smooth image regions. This simple split enables
the image to be processed according to the aim. However, it also implies that details from multiple
locations, including smooth areas, are processed indistinguishably at the same time. In recent years,
wavelet theory has emerged as a powerful image processing tool. It provides both spatial and frequency
information about the image. Adding a high-pass filtered version to the image, or subtracting a low-pass
filtered version from it, can enhance the image. One of the early works on contrast sharpening in the
wavelet domain applied a parametrized hyperbolic function to the gradient of the wavelet coefficients.
Since then, a great deal of work has been done in the wavelet domain. Loza et al., for example, suggested
a non-linear enhancement strategy based on the local dispersion of wavelet coefficients. This technique
adaptively improves image contrast depending on local characteristics of the image's wavelet
coefficients. Another contrast enhancement technique scales the internal noise of a dark image in the
discrete cosine transform (DCT) domain. It is based on a physics concept known as "dynamic stochastic
resonance" (DSR), which employs noise to increase the
performance of a system [60].
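
The idea of adding a high-pass version of the image back to the image can be shown without any transform at all. The sketch below uses unsharp masking, a spatial-domain stand-in chosen for brevity rather than one of the wavelet or DCT methods discussed above: the low-pass version is a box blur, and the high-pass detail is the difference between the image and that blur. The function names and the sample edge are assumptions for this example.

```python
def box_blur(image, radius=1):
    """Low-pass filter: mean of the neighborhood, clipped at the image border."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


def unsharp_mask(image, amount=1.0, radius=1):
    """Add the high-pass detail (image minus its blur) back to the image."""
    blurred = box_blur(image, radius)
    return [
        [v + amount * (v - b) for v, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]


# A vertical step edge between a dark and a bright region.
edge = [[10, 10, 100, 100] for _ in range(3)]
sharpened = unsharp_mask(edge)
```

Flat regions are left unchanged while the step across the edge grows. The result can leave the valid intensity range, so a practical pipeline would clip it afterwards.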

Smoothing and sharpening have diametrically opposite aims. As a result, there are few solutions that
meet both goals at once. The first strategy is to process the image in two steps: performing one
operation, and then applying the second to the resulting image. The order of the operations has a
considerable influence on the result. Sharpening before smoothing may amplify image noise, complicating
the smoothing process. Smoothing before sharpening, on the other hand, risks losing information that the
sharpening technique cannot recover. Although the second ordering yields better outcomes in general, it
is not the best option. As a result, in recent years, techniques capable of combining smoothing and
sharpening have been presented [60]. Contrast limited adaptive histogram equalization (CLAHE) [66] and
the anisotropic filtering introduced by Perona and Malik (PM) [67] have been combined simultaneously by
means of a synchronization algorithm, which improves on the corresponding two-step methods based on
them. The technique makes use of the benefits offered by these distinct models and combines them with the
intention of developing a powerful tool for the enhancement of medical images, more specifically lung
images.
4.3. Noise removal

Image denoising eliminates noise from images and restores them toward their original state. A
fundamental issue in image denoising is how to distinguish between noise, edges, and texture. Image
denoising techniques are utilized in medical imaging, remote sensing, military surveillance, biometrics
and forensics, industrial and agricultural automation, and individual recognition. In medical and
biomedical imaging, denoising algorithms are important preprocessing steps that remove noise such as
speckle, Rician, quantum, and others [68].

The mean filter (MF), the adaptive mean filter (AMF), and the bilateral filter (BF) are three examples
of the many algorithms that can be used to remove or reduce noise. The MF is the simplest of the three,
the AMF is a refinement of the mean filter, and the BF is considered a state-of-the-art noise reduction
technique. The BF combines a range filter and a domain filter within a window of a particular size. It is
a non-iterative adaptive smoothing filter that reduces noise while maintaining the edges of the objects
contained within an image. The BF is governed by three parameters: the window size, the filter range, and
the filter domain. Because of this, the filter is able to preserve a significant amount of the image's
fine detail as well as its texture. However, the BF technique requires a considerable amount of
computation and is mathematically complex; consequently, the development of a simple and quick algorithm
for noise reduction that maintains excellent image quality would be very
significant [69].
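
The interplay of the three bilateral filter parameters can be made concrete with a small sketch. It assumes Gaussian weights for both the domain (spatial distance) and the range (intensity difference), which is the common textbook form and not necessarily the exact variant evaluated in [69]; the names `radius`, `sigma_space`, and `sigma_range` are chosen for this example.

```python
import math


def bilateral_filter(image, radius=1, sigma_space=1.0, sigma_range=25.0):
    """Weighted mean where weights fall off with spatial distance AND intensity difference."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = image[r][c]
            weighted_sum = 0.0
            weight_total = 0.0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    value = image[rr][cc]
                    # Domain term: nearby pixels count more.
                    spatial = math.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma_space ** 2))
                    # Range term: pixels with similar intensity count more.
                    similarity = math.exp(-((value - center) ** 2) / (2 * sigma_range ** 2))
                    weight = spatial * similarity
                    weighted_sum += weight * value
                    weight_total += weight
            out[r][c] = weighted_sum / weight_total
    return out


# Noisy dark region (about 50) next to a bright region (200).
image = [
    [48, 52, 200, 200],
    [52, 48, 200, 200],
    [48, 52, 200, 200],
]
filtered = bilateral_filter(image, radius=1, sigma_space=1.0, sigma_range=10.0)
```

The noise inside the dark region is averaged out, while pixels across the edge receive a negligible range weight, so the edge stays sharp. The nested loops also show why the filter is comparatively expensive: every output pixel requires two exponentials per neighbor.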

Most of the filters listed above produce results of satisfactory quality; nonetheless, each has a few
deficiencies. These include insufficient optimization in the test phase, the need for manual parameter
settings, and dependence on a particular denoising model. Fortunately, the adaptability of CNNs has made
it possible to overcome these shortcomings [70]-[73]. An overview of CNN-based image denoising procedures
is presented in Figure 3. The explanations in this study are intended to make it possible to comprehend
the CNN architectures that are utilized in image denoising [68].
Figure 3. CNN-based methods for removing noise for lung images [68]


5. LUNG SEGMENTATION

Lung segmentation is a procedure used in lung nodule detection. It involves isolating lung nodules from
other portions of the lung CT scan image and then further enhancing the resulting image so that detail
can be seen [55]. Lung segmentation techniques are broadly classified into four categories:
i) deformable boundary-based techniques, ii) edge-based techniques, iii) threshold-based techniques, and
iv) registration-based techniques [28], [74]-[77]. Each of these categories has its own subcategories,
and each method has its own set of benefits and drawbacks. This review divides lung segmentation into
three stages: histogram-based thresholding, connected component analysis, and lung extraction. The stages
are explained briefly in this section, which concludes with the possibility of using CNN-based lung
segmentation methods.


5.1. Histogram-based thresholding

Because of its ease of use, the global histogram of a digital image is a popular tool for real-time image 
processing. It provides an important foundation for statistical techniques in image processing by giving a global 
description of the image's information. A pixel's color in a color image using RGB representation is a blend of 
the three primitive hues red, green, and blue. Each image pixel may be seen as a three-dimensional vector with 
three components representing the three colors of each image pixel. As a result, the global histograms 
representing the three primitive components may yield global information about the entire image.
Histogram-based thresholding is a common segmentation approach that looks for peaks and troughs in the
histogram. A standard segmentation strategy based on histogram analysis can only be carried out if the
dominant peaks in the histogram can be accurately identified. Several frequently used peak-finding
methods assess the sharpness or area of each peak to determine the dominant peaks in the histogram.
Although these peak-finding techniques are beneficial in histogram analysis, they may not always function
effectively, especially when the image contains noise or abrupt changes [78], [79].
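
A simple form of the peak-and-trough idea is sketched below. It is a simplified illustration rather than any of the cited peak-finding methods: the histogram is smoothed with a moving average, the two highest peaks that are at least `min_separation` bins apart are taken as dominant, and the threshold is placed at the lowest bin between them. All names and the sample gray levels are assumptions for this example.

```python
def histogram(image, levels=256):
    """Count how often each gray level occurs."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    return hist


def smooth(hist, radius=2):
    """Moving average, so that small noisy bumps are not mistaken for peaks."""
    n = len(hist)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def valley_threshold(hist, min_separation=20):
    """Return the lowest bin between the two dominant, well-separated peaks."""
    first = max(range(len(hist)), key=lambda i: hist[i])
    candidates = [i for i in range(len(hist)) if abs(i - first) >= min_separation]
    second = max(candidates, key=lambda i: hist[i])
    lo, hi = sorted((first, second))
    return min(range(lo, hi + 1), key=lambda i: hist[i])


# Bimodal toy image: a dark class (30) and a bright class (200).
image = [[30] * 6 + [200] * 4 for _ in range(5)]
threshold = valley_threshold(smooth(histogram(image)))
mask = [[v > threshold for v in row] for row in image]
```

The smoothing step is what gives this sketch some tolerance to noise; with a strongly noisy or unimodal histogram the two "dominant" peaks may not correspond to real classes, which is the weakness noted above.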

Filho et al. [58] used a threshold value of 90 because they found it effective for identifying lung
nodules. Threshold approaches in lung segmentation make it simpler to locate nodules and reduce the
amount of lung tissue that does not contain a nodule [1]. In addition to thresholding, lung segmentation
techniques include watershed techniques [54], morphological operations [65], active contour modeling
(ACM) [56], active appearance modeling (AAM) [53], active shape modeling (ASM) [36], and fissure region
segmentation [37]. Widodo et al. [53] proposed the AAM methodology, a statistical learning method that
models parameters to characterize shape differences across classes and variation in texture. Principal
component analysis (PCA) is used to carry out the eigen analysis of the covariance matrix of the training
vectors in a way that is consistent with the shape and texture of the training images. The AAM model as a
whole is composed of three subcomponents: i) alignment of the shape data; ii) creation of a parameter
model based on statistical and real data; and iii) template matching. This segmentation was able to
distinguish the lungs from the adjacent tissues [53]. Kasinathan et al. [56] proposed an ACM approach for
the segmentation of lung tumors in which an active contour model is integrated with the field formulation
of the locally biased image. The mean square error was used to reconcile properly homogeneous CT images
and to quickly split off tumor zones with inhomogeneous intensity. The proposed ACM approach was tested
on 850 lesion images from the LIDC-IDRI database and was able to reliably locate lung tumors in CT scans.


5.2. Connected component analysis (CCA) 

Connected component analysis (CCA), also known as connected component labeling, blob extraction, or
region labeling, is a graph theory-based approach for determining the connectedness of "blob"-like
regions in a binary image. CCA is frequently utilized in the same contexts as contours; however,
connected component labeling can often provide more granular filtering of blobs in a binary image. With
contour analysis, the outline hierarchy (i.e., one contour contained within another) is commonly a
constraint, whereas connected component analysis makes it easier to segment and examine such structures.
The primary purpose of CCA is to extract the connected components of a binary image together with summary
data, such as area, bounding boxes, center of gravity, and so on, which are then further processed
depending on the application. CCA is traditionally implemented as a sequence of two computations:
connected component labeling (CCL) and feature computation (FC). The CCL procedure distinguishes the
connected components in the input binary image by assigning a unique label to all pixels that belong to
the same connected component. The FC algorithm then processes the labeled image to obtain one or more
of the above-mentioned summary data required by the subsequent phases [80].
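
The CCL and FC steps can be sketched as two small functions. This is a minimal illustration that assumes 4-connectivity and a breadth-first flood fill; production implementations typically use faster two-pass or union-find labeling. The function names and the toy binary image are made up for this example.

```python
from collections import deque


def label_components(binary):
    """CCL: give every 4-connected foreground region its own label (1, 2, ...)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                current += 1
                labels[r][c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current


def component_features(labels, count):
    """FC: area, bounding box (min_row, min_col, max_row, max_col) and centroid per label."""
    pixels = {k: [] for k in range(1, count + 1)}
    for r, row in enumerate(labels):
        for c, k in enumerate(row):
            if k:
                pixels[k].append((r, c))
    features = {}
    for k, coords in pixels.items():
        ys = [y for y, _ in coords]
        xs = [x for _, x in coords]
        features[k] = {
            "area": len(coords),
            "bbox": (min(ys), min(xs), max(ys), max(xs)),
            "centroid": (sum(ys) / len(coords), sum(xs) / len(coords)),
        }
    return features


binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
labels, count = label_components(binary)
features = component_features(labels, count)
```

The feature dictionary is what later stages would filter on, for example by discarding components whose area is too small to be a candidate nodule.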
Connected component analysis is used to analyze the labeled connected components in the binarized
image. The pixels in the binary image are divided into multiple connected components based on their
pixel connectivity. The lung lobes are segmented using the first-level connected components; the area
and boundary information of the labeled components is used to make this selection. The second level of
connected components is then identified, and geometric characteristics are retrieved from each
component. These include area, bounding box, eccentricity, equivalent diameter, major axis length, minor
axis length, and perimeter. The information gathered from the components is evaluated to determine
whether or not they are tumors. An overall evaluation of lung nodules is provided in [81]. A lung nodule
greater than 5 mm in size has a significant likelihood of being malignant. According to those findings,
an eccentric component with an overall size of 5 mm or greater is likely to be malignant. The results of
these tests are used to label such components as malignant and remove the remainder [82].


5.3. Lung extraction 

The purpose of lung extraction is to identify the thoracic wall and mediastinum voxels, allowing the 
subsequent stages of work to focus entirely on the region that forms the lung parenchyma. This is accomplished 
once more through the use of the region-growing algorithm. Occasionally, the lung extraction stage
incorrectly removes certain voxels from the pulmonary parenchyma. These errors can exclude potential
nodules, resulting in a detection error. As a result, a reconstruction stage is critical for the
preservation of peripheral nodules. To reconstruct the incorrectly eliminated lung outline, prior
knowledge about the object being segmented is used. The lung is a well-known organ with a gentle
contour and no re-entrances. As a result, any hole or abrupt discontinuity observed on its outline is a
clear indication of perimeter collapse and must be repaired [83].
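
Region growing, on which the extraction step relies, can be sketched as a flood fill with an intensity criterion. The version below is a 2D simplification that assumes a single seed, 4-connectivity, and a fixed tolerance around the seed intensity; the function name, the tolerance, and the Hounsfield-like sample values are illustrative assumptions, not parameters from [83].

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from seed, adding pixels close to the seed intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    mask = [[False] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols and not mask[ny][nx]
                    and abs(image[ny][nx] - seed_value) <= tolerance):
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask


# Toy slice: air-filled parenchyma (about -800) surrounded by denser tissue (40).
slice_values = [
    [40, 40, 40, 40, 40],
    [40, -810, -790, -800, 40],
    [40, -805, 40, -795, 40],
    [40, 40, 40, 40, 40],
]
lung_mask = region_grow(slice_values, seed=(1, 1), tolerance=50)
```

The dense pixel enclosed by the grown region is left out of the mask. In this toy case it plays the role of a structure that the extraction step would wrongly exclude, which is why a subsequent outline repair step matters.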

Kuruvilla and Gunavathi [65] proposed a method for lung segmentation that makes use of 
morphological procedures on CT images. These operations are carried out by transforming grayscale images 
to binary images. The speed and user-friendliness of the morphological operation approach are two of its 
defining characteristics. Jayaraj and Sathiamoorthy [54] used the watershed segmentation approach, whose
most distinguishing feature is the capacity to isolate and identify objects that are in close proximity
within the image. This paradigm of mathematical morphology is built on the concept of regions. It is a
kind of image decomposition that assigns each pixel to a region or watershed. Huang et al. [50] proposed
deep learning, namely the faster region CNN (faster R-CNN), as a method for performing the segmentation
procedure on lung CT images. It employs five convolution layers at a lower rate than usual because of the
improved resolution and increased segmentation precision it offers. The proposed fully convolutional
network (FCN) method was tested on the LIDC-IDRI database, and the results showed an accuracy of 94.6%.
In 2018, Chunran and Yuanyuan [84] also employed an FCN to segment CT images from the LIDC-IDRI
database.

Messay et al. [85] proposed a regression neural network (RNN) segmentation approach. They developed
a fully automated (FA) system, a semi-automated (SA) system, and a hybrid system that is derived from the FA
and SA systems. These systems yield many parameters, which are then determined adaptively for each nodule by
the RNN. Sankar and George [86] also presented an RNN method in 2020, using it to improve the identification
of juxtapleural and juxtavascular lesions. Their RNN technique performs better than the skeleton graph cut
method and the level set method, both of which are used to recognize lesions with the same intensity level.
Shaziya and Shyamala [87] suggested UNet and CNN for segmenting lung CT images. According to the dice
similarity coefficient (DSC), the UNet approach is 1.27% more effective than CNN at image segmentation.
In addition, Arora et al. [88] segmented a total of 662 chest X-ray (CXR) images using the UNet approach.
When they segmented the lungs of TB patients, they obtained a DSC value of 0.9680.
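
The DSC used to compare these segmentation methods is defined for a predicted mask A and a ground-truth
mask B as DSC = 2|A ∩ B| / (|A| + |B|). The following minimal Python sketch computes it for two binary masks
of equal shape. The function name and the convention of returning 1.0 when both masks are empty are
assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two binary masks given as lists of rows."""
    intersection = size_pred = size_truth = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            intersection += 1 if (p and t) else 0
            size_pred += 1 if p else 0
            size_truth += 1 if t else 0
    if size_pred + size_truth == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / (size_pred + size_truth)
```

A DSC of 1 means the two masks coincide exactly, and a DSC of 0 means they do not overlap at all.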

In practice, the segmentation performance of a rule-based approach is essentially the same as that of
a data-based approach. However, a data-based method needs a significant amount of time to train the learning
model, and its computational cost is higher than that of a rule-based approach to CAD system optimization. To
make rule-based processing of lung CT images more convenient for researchers, a rule-based approach can be
obtained by altering the manual settings of a data-based method [34]. Consequently, thresholding, FCN, RNN,
and UNet are the most effective segmentation techniques for image detection tasks. For more details, the lung
nodule preprocessing and segmentation techniques
are shown in Table 2.

| Authors | Year | Databases | Scans | Nodules size | Preprocessing methods | Segmentation methods |
|---|---|---|---|---|---|---|
| Kuruvilla and Gunavathi [65] | 2014 | LIDC-IDRI | 155 | 260-400 mm | Gray scale to binary image | Morphological operations |
| De Carvalho Filho et al. [58] | 2014 | LIDC-IDRI | 640 | 3 mm | Contrast, Gaussian filter and median filter | Quality threshold clustering and region growing |
| Messay et al. [85] | 2015 | LIDC-IDRI | 456 | 4.21-31.62 mm | A semi-automated (SA) system, a fully-automated (FA) system, and a hybrid system | RNN |
| Chen et al. [89] | 2015 | LIDC-IDRI | 1010 | 8 mm | Resizing and Z-score normalization using mean and standard deviation | CNN and deep belief networks (DBN) |
| Widodo et al. [53] | 2017 | Private | 120 | 0.5-10 mm | Histogram equalization and Gaussian lowpass filter | AAM |
| Jiang et al. [1] | 2018 | LIDC-IDRI | 1006 | 3 mm | Brightness, position, shape, and repair of the lung contour for juxta-pleural nodules | Local threshold segmentation |
| Chunran and Yuanyuan [84] | 2018 | LIDC-IDRI | 1010 | 3-30 mm | Original images | FCN |
| Gong et al. [41] | 2018 | LUNA16 and ANODE09 | 1079 | 0.5-2 mm | OTSU thresholding | Segmentation of 3D levels and local image characteristics |
| Ausawalaithong et al. [90] | 2018 | JSRT, NIH | 247 and 100 | 2048x2048 and 1024x1024 pixels | Increasing contrast, noise removal, image resizing and image normalizing | CNN |
| Anitha and Babu [37] | 2019 | LIDC and ELCAP | 50 and 30 | 0.625 mm | Morphological transformation and Wiener filter | Fissure regions segmentation |
| Kasinathan et al. [56] | 2019 | LIDC-IDRI | 850 | 0.45-0.75 mm | Remove the mediastinum region and thoracic wall | ACM |
| Huang et al. [50] | 2019 | LIDC-IDRI and LUNA16 | 888 | 0.6-5.0 mm | Linear mapping | FCN |
| Ardila et al. [91] | 2019 | LIDC, LUNA, and NLST | 1139 | 8-15 mm | CNN | CNN |
| Li et al. [36] | 2020 | JSRT | 247 | 17.3 mm | Rib suppression | CNN |
| Sankar and George [86] | 2020 | LIDC-IDRI | 1018 | 3 mm | Gaussian filtering | RNN |
| Shaziya and Shyamala [87] | 2020 | Private | 267 | 128x128 pixels | Data augmentation: crop, zoom, rotate and flip | CNN and UNet |
| Arora et al. [88] | 2021 | NLM-China CXR | 662 | 15 mm | CLAHE method | UNet++ |
| Nazir et al. [92] | 2021 | LIDC-IDRI | 4682 | 3-30 mm | Laplacian pyramid (LP) sparse vector fusion | CNN |
| Osadebey et al. [93] | 2021 | LIDC-IDRI | 1100 | 3-30 mm | CNN | CNN |
| Chavan et al. [94] | 2022 | Shenzhen and Montgomery | 800 | 256x256 | CNN | CNN |
| Tandon et al. [95] | 2022 | LIDC-IDRI | 1018 | 3 mm | Data augmentation methods (rescaling, rotation, horizontal and vertical flip) | CNN |


6. LUNG NODULE DETECTION 


The process of detecting objects in the lung tissue that are assumed to be nodules is referred to as
"candidate nodule detection." This detection stage is carried out after the lung segmentation stage. Lung
segmentation lowers the burden of detection on the CT input images because the background and undesired areas
have already been removed. Lung nodules can be recognized with a number of distinct methods, such as random
forest, SVM, naive Bayes, k-nearest neighbor (k-NN), and CNN. CNN-based detectors come in several distinct
designs, the most prominent of which are the plain CNN, R-CNN, Faster R-CNN, and 3D CNN.

Using random forests and 10-fold cross-validation, Gong et al. [41] identified lung nodules for CAD
systems. Their CAD system was validated on two datasets, LUNA16 and ANODE09. De Carvalho Filho et al. [58]
detected lung nodules with the SVM method, an algorithm derived from the Vapnik-Chervonenkis theory. The
accuracy achieved by the SVM depends on the selection of parameters such as C and the radial basis function
(RBF) kernel. The LIDC-IDRI database was used for testing, with 140 new exams in total [58]. Nobrega et al.
[57] proposed the naive Bayes and k-NN methods for lung nodule detection. These classifiers were built on a
Gaussian probability density function.
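
To make the role of the Gaussian probability density function concrete, the following is a minimal Gaussian
naive Bayes sketch in Python. It is not the implementation of [57]. The feature layout, the class labels, the
variance floor, and all function names are hypothetical and chosen only for illustration.

```python
import math


def log_gaussian_pdf(x, mean, var):
    """Logarithm of the Gaussian density N(x; mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def fit_gaussian_nb(samples, labels, var_floor=1e-9):
    """Estimate, for each class, its prior and per-feature mean and variance."""
    model = {}
    for cls in set(labels):
        rows = [s for s, label in zip(samples, labels) if label == cls]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The small floor avoids a zero variance for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(samples), means, variances)
    return model


def predict_gaussian_nb(model, sample):
    """Return the class with the highest log-posterior under the independence assumption."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        return math.log(prior) + sum(
            log_gaussian_pdf(x, m, v) for x, m, v in zip(sample, means, variances)
        )
    return max(model, key=log_posterior)
```

Working with log-densities rather than raw densities avoids numerical underflow when a candidate lies far
from a class mean.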

When attempting to detect lung nodules using machine learning techniques, the difficulty of defining
and choosing the attributes of an image arises. As the number of images in each category rises, the process of
extracting features from them becomes increasingly time-consuming and labor-intensive [34]. Over the past few
years, methods based on deep learning, most notably the CNN, have therefore been developed for the detection
of lung nodules. In its most basic form, a CNN is made up of three layers: a convolutional layer (CONV), a
pooling layer, and a fully connected layer (FC). Jiang et al. [1] studied lung nodule detection with a CNN
whose convolutional layers use the rectified linear unit (ReLU) as the activation function. Compared with
other activation functions, ReLU speeds up training, which is one of its advantages. The pooling layer
performs max-pooling and average-pooling operations, whereas the FC layer is made up of four separate
channels. The proposed CNN approach was assessed using the LIDC-IDRI database. Wang et al. [96] also use a
multi-layer CNN in their lung nodule detection process. The activation function of the convolution layer is
Leaky ReLU, the pooling layer uses average pooling, and the final layer, the FC layer, uses global average
pooling. The FC layer implements a feature mapping strategy with a 4 by 4 kernel in an effort to cut down on
the number of connected parameters. In addition, a batch normalization layer is implemented to lessen the
impact of overfitting and speed up the convergence of the network [96]. Kasinathan et al. [56] and Li et al.
[36] carried out further studies on detecting lung nodules with the CNN method. The CNN method continues to be
developed, and it currently incorporates architectural models such as region-based fully convolutional
networks (RFCN) [97], regional CNN (R-CNN) [98], faster regional CNN (Faster R-CNN) [50],
ResNet50 [57], and 3D-CNN [99].
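
The activation and pooling operations named above can be sketched in a few lines of Python. This is only an
illustration of the individual building blocks, not a reconstruction of the networks in [1] or [96]. The
leaky slope of 0.01, the non-overlapping 2x2 windows, and the function names are assumptions.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clamps the rest to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: like ReLU, but negative inputs keep a small slope."""
    return x if x > 0 else slope * x


def mean(values):
    return sum(values) / len(values)


def pool2d(feature_map, size, reduce_fn):
    """Non-overlapping size x size pooling; assumes both dimensions are divisible by size."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            reduce_fn([feature_map[y + dy][x + dx] for dy in range(size) for dx in range(size)])
            for x in range(0, w, size)
        ]
        for y in range(0, h, size)
    ]


def global_average_pool(feature_map):
    """Collapse a whole feature map into a single value."""
    return mean([v for row in feature_map for v in row])


feature_map = [
    [1.0, -2.0, 3.0, 0.0],
    [4.0, 5.0, -6.0, 2.0],
    [0.0, 1.0, 2.0, 3.0],
    [-1.0, -1.0, 4.0, 5.0],
]
activated = [[relu(v) for v in row] for row in feature_map]
max_pooled = pool2d(activated, 2, max)    # max-pooling
avg_pooled = pool2d(activated, 2, mean)   # average-pooling
```

Max-pooling keeps the strongest response in each window, average-pooling keeps the mean response, and global
average pooling reduces each feature map to one number, which is why it cuts down the number of parameters
feeding the final layer.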

As technology develops, detection becomes computationally faster, and the amount of available data
increases, deep learning has become crucial for applying the CNN approach to the analysis of medical images.
The CNN method has a number of benefits, the most notable of which are enhanced image detection performance,
high flexibility and adaptability to a wide variety of datasets, and the capacity to be designed automatically
and effectively by making use of black-box operations [34]. According to the findings of previous research,
deep learning offers the following advantages:

— The performance of the CAD system in detecting nodules in lung cancer may be improved through the 
use of techniques from deep learning. Not only does the CAD system detect the presence of lung nodules, 
but it also provides information on the location of those nodules and has the ability to categorize detected 
nodules as either benign or malignant [97]. 

— Deep CNN has the potential to increase the sensitivity of the detection of lung nodules by reducing the 
value of FPs/scan, thereby lowering the error rate in detection and, of course, improving the quality of 
detection [99]. 

— Deep CNN can be used to detect lung nodules in a variety of dataset sources. For example, it can be used 
to detect data from hospital A, and then it can be used to detect data from hospital B. Deep CNN is able 
to discover various CT scans of the lungs and categorize them into distinct groups [37], [41], [50], [100]. 


7. CLASSIFICATION: FALSE POSITIVE REDUCTION 

Following candidate nodule detection, each candidate is further classified as either a nodule or a
non-nodule. This part of the process is referred to as the false positive (FPs) reduction stage. FPs reduction
can be divided into two categories: traditional feature-based classification and CNN-based classification.
Conventional feature-based classification combines feature extraction with the detection of candidate
nodules, and several approaches to both have been suggested. The following is a review of some of the
publications on these two classification categories for CAD systems in lung images.

Filho et al. [58] proposed the SVM method for classifying lung images, using data from the LIDC-IDRI
database and a feature extraction process based on shapes and textures. With an FPs/scan value of 0.008 and a
free-response operating characteristic (FROC) value of 0.8062, the test achieved an accuracy of 97.55%, a
sensitivity of 85.91%, and a specificity of 97.70%. Han et al. [101] and Boroczky et al. [102] also used the
SVM method. In the work of Han et al., feature-based SVM classification depended on categorical rules, which
took the form of geometric or shape features, intensity features, gradient features, and eigenvalue-based
features. The proposed system was tested on 205 patient cases from the openly accessible online LIDC database
and obtained a sensitivity of 89.2% at 4.14 FPs/scan [101]. The genetic algorithm method developed by
Boroczky et al. for extracting features, applied to private lung CT data (including 52 real nodules and 443
false ones), achieved a sensitivity of 100% and a specificity of 56.4% [102].
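
The figures of merit quoted throughout this section follow from the counts of true positives (TP), false
positives (FP), true negatives (TN), and false negatives (FN). The following minimal Python sketch shows the
standard definitions. The function name is an assumption, and the counts in the usage note are hypothetical
values, not data from any of the cited studies.

```python
def detection_metrics(tp, fp, tn, fn, n_scans):
    """Standard detection metrics from confusion counts over a set of scans."""
    return {
        # Fraction of true nodules that were detected.
        "sensitivity": tp / (tp + fn),
        # Fraction of non-nodule candidates that were correctly rejected.
        "specificity": tn / (tn + fp),
        # Fraction of all candidates that were classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Average number of false detections per scan.
        "fps_per_scan": fp / n_scans,
    }
```

For example, with hypothetical counts of 90 TP, 20 FP, 880 TN, and 10 FN over 50 scans, the sensitivity is
0.90, the accuracy is 0.97, and there are 0.4 FPs per scan. Sensitivity and FPs/scan are usually reported
together because one can be traded against the other by moving the decision threshold.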

Jia et al. [103] proposed several three-dimensional methods for lung detection, including shaded
surface display (SSD), volume rendering (VR), maximum intensity projection (MIP), and minimum intensity
projection (MinIP). The feature extraction process uses ROIs to identify suspicious regions more accurately,
paying particular attention to their shape, gray value, position, circularity, mean gray level, and
smoothness. With a sensitivity of 95% at 0.91 FPs/scan, the proposed method successfully detected lung
nodules. Tariq and Akram [55] investigated the neurofuzzy method for the detection of lung nodules. The
neurofuzzy classifier consists of two sub-networks: a fuzzy self-organizing layer and a multilayer perceptron
(MLP). The feature vector is fed to the fuzzy layer to generate a pre-classification vector, which is then
passed to the MLP for classification. The fuzzy self-organizing layer locates nodule pixels and groups them
according to their similarity (regardless of whether or not nodules are present), with varying membership
values. The MLP then classifies the input vectors in order to select candidates from the appropriate category.
Testing on 100 lung CT image datasets taken from various patients gave an accuracy of 95%. Talebpour et al.
[104] used a three-layer back propagation neural network to separate nodules from non-nodule objects. The
input layer has 22 neurons, the hidden layer has five neurons, and the output layer has one neuron. Every
neuron uses the tan-sigmoid function. The method was tested on the LIDC-IDRI database and showed a sensitivity
of 90% at 10 FPs/scan [104]. Kuruvilla and Gunavathi [65] also used a back propagation neural network and
reported a sensitivity of 91.4% at 30 FPs/scan.
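
The 22-5-1 network topology described for the back propagation neural network can be sketched as a forward
pass in Python. The sketch assumes that "tan-sigmoid" denotes the hyperbolic tangent and that the activation
is applied in the hidden and output layers. The random weights, the zero biases, and the input vector are
placeholders, since the trained parameters of [104] are not given in the text.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; real values would come from back propagation training."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward_layer(inputs, layer):
    """Weighted sum per neuron followed by the tan-sigmoid (tanh) activation."""
    weights, biases = layer
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, hidden, output):
    """22 input features -> 5 hidden neurons -> 1 output score in (-1, 1)."""
    return forward_layer(forward_layer(features, hidden), output)[0]


rng = random.Random(0)               # fixed seed so the sketch is reproducible
hidden = make_layer(22, 5, rng)      # hidden layer: 5 neurons, 22 inputs each
output = make_layer(5, 1, rng)       # output layer: 1 neuron, 5 inputs
score = forward([0.1] * 22, hidden, output)
```

In a trained network the sign or magnitude of the output score would be compared with a threshold to decide
between nodule and non-nodule, and moving that threshold trades sensitivity against FPs/scan.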

Gong et al. [41] proposed a random forest method for classifying lung nodules using the LUNA16 and
ANODE09 databases. Testing in both databases gave a sensitivity of 79.3% at 4 FPs/scan and a sensitivity of
84.62% at 2.8 FPs/scan. Jayaraj and Sathiamoorthy [54] also used the random forest method, in their case to
detect lung cancer on CT images from the LIDC dataset. The random forest classification takes into account
both the index and the entropy. The proposed method achieved an accuracy of 89.90%, a sensitivity of 90.85%,
and a specificity of 88.32%. Nobrega et al. [57] studied the effectiveness of deep transfer learning for
classifying lung nodule malignancy. Their goal was to improve such systems and to test them on the LIDC
database. The proposed method compares deep transfer learning with deep feature learning and yields an area
under the curve (AUC) of 93.1%, a true positive rate (TPR) of 85.38%, an accuracy (ACC) of 88.41%, a precision
of 73.48%, and an F1-score of 78.83%. They found that deep transfer learning is an effective way to extract
the most important features from CT images of lung nodules.
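
Random forests are built from decision trees whose splits are commonly scored with an impurity measure.
Assuming that "the index" mentioned above for the random forest of [54] refers to the Gini index (the text
does not say so explicitly), the two measures can be sketched in Python as follows. The function names and
the labels in the usage note are illustrative.

```python
import math
from collections import Counter


def gini_index(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p) over the class proportions."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())
```

Both measures are zero for a node that contains a single class and are largest when the classes are evenly
mixed. For a node holding two nodules and two non-nodules, the Gini index is 0.5 and the entropy is 1 bit.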

The CNN classification method is utilized next in the FPs reduction stage. Shin et al. [44] developed
a CNN approach for lung nodule identification based on the GoogLeNet architecture, which comprises a
convolution layer, three pooling layers, and nine inception layers. GoogLeNet's inception layers are each made
up of six convolutional layers and one pooling layer. The system was tested on the ILD dataset and showed a
relatively low accuracy of 79%. To identify lung CT images, Golan et al. [105] suggested a deep CNN trained
with the back-propagation algorithm. The CNN is divided into two parts. The first comprises several volumetric
convolution, rectified linear unit (ReLU), and max-pooling layers. The second is a classifier composed of
several fully connected, threshold, and softmax layers. The system was tested on the LIDC dataset and yielded
a low sensitivity of 78.9% at 20 FPs per scan. Anthimopoulos et al. designed and tested a CNN for the
classification of ILD patterns. The network comprises five convolutional layers with leaky ReLU activations,
a pooling layer, and three dense layers. Its classification performance on lung patterns is 85.5% [100]. Dou
et al. proposed a three-dimensional (3-D) CNN to improve automated lung nodule identification from volumetric
CT data. The approach was tested in the LUNA16 challenge, yielding a sensitivity of 90% at 8 FPs per scan
[99]. Tekade and Rajeswari [17] presented lung cancer detection and classification using deep learning, with a
3D multipath VGG-like network tested on the LIDC, LUNA16, and Kaggle datasets. The result for lung nodule
identification and classification is an accuracy of 95.60% and a log loss of 0.387732. Kido et al. [98]
developed a CAD method for lung abnormalities based on CNN and regions with CNN features (R-CNN). R-CNN is an
object detection system that uses a CNN to classify regions within an image. It was trained on marked abnormal
lesions and produced bounding boxes around abnormal lesions in the test images. Their approach has an accuracy
of 84.7%. Jiang et al. [1] suggested an automated detection approach for lung nodules based on multigroup
patches and a deep learning network. At 15.1 FPs per scan, the CAD system achieved a sensitivity of 94%. Huang
et al. [50] proposed deep convolutional neural networks to detect and segment lung nodules in thoracic CT
images in a quick and fully automated manner. The accuracy at the false-positive (FP) reduction step, which
uses a CNN, is 94.6% at 4 FPs per scan. The average dice coefficient of nodule segmentation compared to the
ground truth is 0.793.

Li et al. [36] proposed lung nodule identification in chest X-ray radiography using multi-resolution
convolutional networks. They extracted features with patch-based multi-resolution convolutional networks and
used four distinct fusion algorithms for classification. Evaluated on the JSRT database, their technique
achieved an accuracy of 99% at only 0.2 FPs per scan. Kasinathan et al. [56] proposed a CNN to automate the
detection and classification of three-dimensional lung tumors. The model was evaluated on the LIDC-IDRI
dataset, which included 850 lung nodule-lesion images, and reached an accuracy of 97%. Shi et al. [106]
suggested a CNN multi-scale feature fusion approach for the identification of lung nodules. The detection
framework is made up of two parts: the generation of region proposals and the reduction of false-positive
results. The CNN model is based on VGG16, and experiments on the LUNA16 dataset show an average sensitivity of
82.62%. Masood et al. suggested an automated technique for the identification of lung cancer using the
improved multidimensional region-based fully convolutional network (mRFCN). Their system was trained and
tested on the LIDC dataset, and the experiments show a sensitivity of 98.1% and an accuracy of 97.91% [97].
Wang et al. [96] proposed a raw patch-based CNN for the identification of lung nodules in CT images. On CT
images from the LIDC-IDRI dataset, they compared the performance of ResNet with that of several alternative
CNN architectures and achieved a high detection sensitivity of 92.8% at 8 FPs per scan.


8. CNN-BASED COMPUTER-AIDED LUNG NODULE DETECTION SYSTEM 

The studies reviewed above show that automatic CAD detection systems continue to develop year after
year, with the aim of higher-quality nodule detection and, ultimately, increased efficiency. For identifying
and categorizing lung nodules, the most effective CAD system is one that achieves high levels of both accuracy
and sensitivity. We conducted a literature review on lung nodule detection and summarized many articles
published between 2006 and 2022 in the Science Direct, Springer Link, IEEE Xplore, and Web of Science
databases. Our goal was to identify potential research areas and future problems. Although directly comparing
the outcomes of the studies is important, it was not our primary focus when reviewing the most recent
research; analyzing the framework and assessment process of each proposed technique is more crucial. We
therefore assessed the techniques of lung nodule identification in a CAD system based on the data used, the
number of nodules, the image size, and the best performance, which included sensitivity and the reduction of
false-positive results. In addition, we analyzed and evaluated a number of techniques for detecting lung
nodules at each stage, including preprocessing, segmentation, nodule identification, and classification
between nodules and non-nodules using feature extraction and FPs reduction.

According to our review of the pertinent literature, studies on the identification of lung nodules
have made use of a substantial number of datasets, of which CT scans are the most popular type. Jiang et al.
used 1006 images from the LIDC-IDRI dataset [1], Huang et al. used a total of 888 images [26], and Kasinathan
et al. used 850 images [56]; Wang et al. [96], Masood et al. [97], and other researchers also used the same
dataset. The LUNA16 dataset is also widely used. Gong et al. used the 1186-image LUNA16 dataset [41], which
was also used by Tekade et al. [17], Dou et al. [99], Shi et al. [106], and a number of other researchers.
Table 1 provides an overview of, and information on, a variety of datasets taken from several different public
databases.

In addition to the datasets, we analyzed the CAD systems in terms of their preprocessing,
segmentation, detection, and FPs reduction methods. Table 2 provides a summary of some of the preprocessing
and segmentation techniques that we reviewed. The Gaussian filter and the median filter are the two most
frequently used preprocessing procedures. Jayaraj et al. applied Gaussian and median filters to all 1018
images taken from the LIDC-IDRI database. The Gaussian filter reduces noise in lung cancer diagnosis, and the
median filter removes salt-and-pepper noise, a type of very fine noise that can be found in CT scans. Together
they smooth the lung CT image and reduce speckle noise [54]. Widodo et al. [53], Filho et al. [58], and Sankar
et al. [86] used the same method in their works.
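
The effect of a median filter on salt-and-pepper noise can be seen in a small Python sketch. The window
radius and the border handling (using only the neighbours that fall inside the image) are illustrative
assumptions, not the settings used in [54].

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1) x (2*radius+1) neighbourhood.

    Border pixels use only the neighbours that fall inside the image.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = median(window)
    return out
```

Because the median ignores extreme values, a single bright "salt" pixel in an otherwise uniform patch is
replaced by the value of its neighbours, whereas a mean or Gaussian filter would only spread the outlier over
the window.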

The subsequent step in a CAD system is image segmentation. Lung segmentation is the process of
distinguishing lung nodules from other parts of a lung CT image and then enhancing the resulting image to
obtain additional detail. Thresholding is one of the most widely used approaches to segmentation [1], [55],
[58]. Deep learning as a segmentation approach has developed over time and can take a variety of forms,
depending on the lung nodule being analyzed. Using a faster convolutional neural network, Huang et al.
proposed a segmentation strategy for lung CT images [50] and attained an accuracy of 94.6%. Sankar and George
[86] proposed an RNN to enhance the identification of juxtapleural and juxtavascular lesions. Shaziya and
colleagues proposed UNet and CNN for segmenting lung CT images. According to the dice similarity coefficient
(DSC), the UNet approach is 1.27% more effective than CNN at image segmentation [87]. In the TB category,
Arora et al. performed segmentation with UNet and achieved a DSC value of 0.9680 [88]. Deep learning
segmentation strategies such as FCN, RNN, CNN, and UNet therefore offer a great deal of promise.
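
For the thresholding approach mentioned above, one common way to choose the threshold automatically is
Otsu's method, which Table 2 lists for Gong et al. [41]. The following Python sketch is a minimal version for
integer gray levels. It illustrates the general technique only, and the function names, the 256-level
default, and the flat pixel list are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Class 0 holds pixels <= t and class 1 holds pixels > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of gray levels in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue          # class 0 still empty
        if w1 == 0:
            break             # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def threshold_segment(pixels, t):
    """Label pixels above the threshold as foreground (1), the rest as background (0)."""
    return [1 if p > t else 0 for p in pixels]
```

The method needs no manually chosen parameter, which is one reason thresholding remains a fast rule-based
baseline next to the trained deep learning models.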

Segmentation is followed by the identification of nodules and the extraction of features. The purpose
of this step is to determine whether a given image region is recognized as a nodule. Using SVM with shape and
texture feature extraction, Filho et al. [58] devised a nodule identification method that was both highly
accurate and sensitive. Tariq et al. [55] employed neurofuzzy classification to identify nodules, extracting
features using vector and intensity. Talebpour et al. used a back propagation neural network (BPNN) to detect
nodules, in conjunction with geometric and textural feature extraction [104]. However, machine learning
algorithms for detecting nodules and extracting features have difficulty defining and selecting the
characteristics of a particular image, which also makes the task more time-consuming. Deep learning CNNs are
therefore used extensively in nodule detection. Several studies, such as those by Jiang et al. [1] using CNN,
Tekade et al. [17] using CNN with 3D multipath VGG, Wang et al. [96] using CNN (ResNet), Masood et al. [97]
using RFCN, and Dou et al. [99] using 3D-CNN, have achieved an accuracy and sensitivity of more than 90%.
Methods such as feature extraction and false-positive reduction are essential for achieving the best possible
results in lung nodule detection.

Moving forward, there is a pressing need for more research on CAD systems for the identification of
lung nodules. The target is a more accurate detection result together with a high sensitivity value and a
lower number of false positives. The following are significant concepts for future CAD systems for locating
lung nodules:

— Developing new methods of deep learning, such as the CNN method, with the primary goal of enhancing 
the performance of lung nodule identification. In addition, the batch normalization layer can be included 
in order to lessen the effects of overfitting and speed up the convergence of the network [96]. 

— Developing a CAD system that is capable of detecting all types of lung nodules with high accuracy and 
sensitivity, as well as a low percentage of false-positive results while maintaining these characteristics. 

— If the suggested technique is trained and evaluated using a large number of datasets, such as the LIDC- 
IDRI and LUNA16 public databases, then it will be able to provide a more comprehensive assessment of 
the general and clinical performance of the detection system. 


9. CONCLUSION 

In this study, we have presented a critical analysis of research on CAD systems for lung nodule
identification, based on a literature investigation and an analysis of the reported results. Several lines of
work have used CT scan images to examine the efficacy of the proposed methods. Our overview of the
computer-aided diagnosis (CAD) system for detecting lung nodules shows that the system consists of several
steps: preparing or obtaining image data, preprocessing, segmentation, lung nodule detection, and FPs
reduction, which incorporates feature extraction. Each of these steps is described in this article, together
with the methods that can be used for it. After reviewing some of the most well-known works on lung nodule
detection, evaluated with datasets such as the LIDC-IDRI database, we found that some achieved better results
than others in terms of sensitivity, specificity, accuracy, and the number of FPs per scan, among other
parameters. In recent years, deep learning algorithms such as CNN have become an increasingly popular means of
locating candidate nodules and extracting features. Although some CAD systems achieved outstanding sensitivity
with low false-positive rates, many barriers remain to be overcome in order to optimize CAD systems for the
detection of lung cancer. Above all, a capable CAD system is expected to assist radiologists in the detection
of lung cancer.